Changes in anxiety in the general population over a six-year period

Background Anxiety is a frequent condition in patients and in the general population. The aim of this study was to investigate changes in anxiety over time and to test several psychometric properties of the Generalized Anxiety Disorder Screener (GAD-7) from a longitudinal perspective. Methods The GAD-7 was included in an examination with two waves, six years apart. The study sample (n = 5355) was comprised of representatively selected adults from the general population with a mean age of 57.3 (SD = 12.3) years. Results During the 6-year time interval, anxiety increased significantly from 3.28 ± 3.16 (t1) to 3.66 ± 3.46 (t2). Confirmatory factor analyses proved the longitudinal measurement invariance of the GAD-7. Reliability of the GAD-7 was established both for the cross-sectional and the longitudinal perspective. The test-retest correlation was r = 0.53, and there were no substantial sex or age differences in these coefficients of temporal stability. The mean changes in anxiety were similar for males and females, and there was no linear age trend in the changes measured by the GAD-7. Changes in anxiety over the 6-year period were correlated with changes in satisfaction with life (r = -0.30), bodily complaints (r = 0.31), and the mental component of quality of life (r = -0.48). Conclusion The GAD-7 is a suitable instrument for measuring changes in anxiety. Age and gender have only minor significance when interpreting change scores.


Introduction
Anxiety disorders are common in patients and in the general population [1]. In primary care settings, anxiety disorders are among the most frequent disorders observed [2-4], and these disorders are associated with high use of health care services [5]. Sex and age differences in anxiety have been investigated in several studies. Women generally report higher degrees of anxiety than men do [6-8], while age differences in anxiety are less clear. Most studies found nonlinear and unsystematic effects of age on anxiety [7,9,10].
While there are multiple cross-sectional studies on anxiety with samples of the general population, longitudinal studies are rare. However, to interpret changes in anxiety over time in patients, it is relevant to know which changes occur in the general population. A related question concerns the temporal stability of anxiety. A compilation of several studies on test-retest correlations of anxiety and other variables of mental health [11] showed coefficients between 0.55 and 0.70, with coefficients decreasing as the temporal distance between the measurements increased. However, the question of how the temporal stability of anxiety depends on sociodemographic factors, e.g., whether women are more or less constant in their anxiety level than men, and how the stability of anxiety depends on age or socioeconomic level, had yet to be systematically tested.
Multiple studies have investigated the (cross-sectional) association between anxiety and other variables such as depression [12], physical complaints [13], fatigue [14,15], fear of cancer progression [16] and COVID-19 risk perception [17]. Such studies are useful for clarifying the partial overlap between related constructs and symptoms. However, from a longitudinal perspective it is also of interest whether changes in anxiety correspond with changes in those variables. There are only a few studies that have investigated the associations between change scores of mental health variables [18].
The GAD-7 [19] is a frequently used tool for measuring generalized anxiety. This questionnaire has been translated into multiple languages and has been validated for several clinical groups. Normative values are available [6], and multiple studies have proved the psychometric quality of the GAD-7 [6,20-23] from a cross-sectional perspective. A further issue of the longitudinal psychometric quality of a questionnaire is reliability of change. The common reliability in terms of Cronbach's alpha is high when the items of the questionnaire are highly correlated with each other. This cross-sectional view on reliability can be applied to the longitudinal analysis: To what degree are the changes of the items intercorrelated, and how do the changes in the items contribute to the total change score? This type of reliability analysis will also be presented here.
The GAD-2 is an ultra-short form of the GAD-7. It consists of the two psychometrically most reliable items of the GAD-7 [12,24]. Because there is a need for very brief instruments that can be effectively used in epidemiological research, it is also relevant to test the cross-sectional and longitudinal psychometric properties of this ultra-short instrument.
Measurement invariance of the GAD-7 across sex and age has been tested in several cross-sectional studies with samples of the general population and clinical samples [6,7,25,26], and longitudinal measurement invariance across several time points has been examined in certain groups of patients [27,28]. Such analyses, however, had yet to be performed with samples of the general population.
Based on the data of this follow-up study, the aims of this paper were (a) to analyze changes in anxiety during a 6-year interval, (b) to test psychometric properties of the GAD-7 and the GAD-2 including coefficients of temporal stability, reliability of change, and measurement invariance, (c) to analyze sex and age differences in the changes of anxiety, and (d) to examine the associations between anxiety and other variables (quality of life, bodily complaints, life satisfaction, habitual optimism, and social support) both from a cross-sectional and from a longitudinal perspective.

Sample
The LIFE-Adult-Study of the Leipzig Center for Civilization Diseases (LIFE) is a population-based study with a representative sample (n = 10000) of people living in the city of Leipzig, Germany. The first wave of this study was conducted between 2011 and 2014. The local residents' registration office generated an age- and gender-stratified random selection of inhabitants, ranging in age from 18 to 80 years. According to the study protocol, the focus was on the age range 40-80 years. At the study center, the study participants underwent a sequence of assessments, including collection of their sociodemographic data, behavioral and lifestyle factors, medical history, and several medical examinations. Details of the study design have been published elsewhere [29].
Between 2017 and 2021, all participants of the first wave (t1) who could be reached were invited to attend a follow-up assessment (t2). Those participants who were able and willing to take part in the follow-up examination were sent a letter with multiple questions and questionnaires by mail. The GAD-7 was used for both the t1 and the t2 assessment. Both the baseline and the follow-up study were approved by the Ethics Committee of the University of Leipzig (approval numbers 263-2009-14122009, 263/09-ff, and 201/17-ek). Written informed consent was obtained from all participants. Results of the baseline assessment of this study regarding the GAD-7 have already been published [7]. The present article adds the results of the longitudinal analyses.

Instruments
The GAD-7 [19] is a one-dimensional questionnaire designed to detect symptoms of generalized anxiety disorder according to the DSM-IV. The item scores range from 0 (not at all) to 3 (nearly every day), resulting in sum scores that range from 0 to 21. The GAD-2 is an ultra-short form of the GAD-7 [30] consisting of only two items. Following a recent study on sensitivity to change of the GAD-7 [23], we used change scores of 4 or greater to reflect a clinically important difference for individuals.
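The scoring rule and the change criterion described above can be sketched in a few lines of Python. This is only an illustration: the function names and category labels are hypothetical, and the sketch assumes complete item data (handling of missing items is not covered here).

```python
from typing import Sequence


def gad7_sum(items: Sequence[int]) -> int:
    """Sum seven item scores (each 0..3) into a total score (0..21)."""
    if len(items) != 7 or any(not 0 <= score <= 3 for score in items):
        raise ValueError("expected seven item scores, each between 0 and 3")
    return sum(items)


def classify_change(total_t1: int, total_t2: int, threshold: int = 4) -> str:
    """Label a t1 -> t2 change using the 4-point criterion from the text."""
    delta = total_t2 - total_t1
    if delta <= -threshold:
        return "relevant decrease"
    if delta >= threshold:
        return "relevant increase"
    return "no relevant change"


# Hypothetical respondent: item scores at t1 and at t2
t1_total = gad7_sum([1, 0, 1, 0, 0, 1, 0])  # total of 3
t2_total = gad7_sum([2, 1, 1, 1, 1, 1, 0])  # total of 7
label = classify_change(t1_total, t2_total)  # change of +4
```

The threshold is kept as a parameter so that a different criterion for a clinically important difference could be substituted.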
In addition to the GAD-7, the following instruments were included both at baseline and at follow-up: The Satisfaction With Life Scale SWLS [31] (general life satisfaction), the Short Form Health Survey-8 SF-8 [32] (quality of life), the Patient Health Questionnaire-15 PHQ-15 [33] (bodily complaints), the Life Orientation Test LOT-R [34] (dispositional optimism), and the ENRICHD Social Support Instrument [35] (social support).
Sociodemographic factors were obtained in a structured interview. Socioeconomic status (SES) was calculated in accordance with the Robert-Koch-Institute [36], integrating education, income, and professional position into one index. For the regression analyses, socioeconomic status was categorized into three strata.

Statistical analysis
Mean score differences between two groups of participants were expressed with effect sizes (Cohen's d), relating the mean score differences to the pooled standard deviation. Cronbach's alpha coefficient was used to determine internal consistency. Coefficients of temporal stability were calculated with Pearson's correlation coefficients. Some researchers prefer to use intraclass correlations; however, most of the research on temporal stability in the literature uses simple correlation coefficients. Thus, to enhance comparability with these studies, we also used these more common Pearson correlation coefficients.
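As a minimal sketch of the three coefficients named above, the following Python functions use only the standard library. The function names are hypothetical, and the pooled standard deviation is assumed to weight both variances equally, which is appropriate when both groups or occasions have the same size.

```python
import math
from statistics import mean, variance


def cohens_d(m1: float, sd1: float, m2: float, sd2: float) -> float:
    """Mean difference divided by the pooled SD (equal weights assumed)."""
    pooled_sd = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return (m2 - m1) / pooled_sd


def cronbach_alpha(columns: list) -> float:
    """Internal consistency; `columns` holds one list of scores per item."""
    k = len(columns)
    totals = [sum(row) for row in zip(*columns)]
    item_variance_sum = sum(variance(col) for col in columns)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def pearson_r(x: list, y: list) -> float:
    """Pearson correlation, e.g. between t1 and t2 scores (test-retest)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Summary statistics reported in the abstract: 3.28 ± 3.16 (t1), 3.66 ± 3.46 (t2)
d = cohens_d(3.28, 3.16, 3.66, 3.46)  # about 0.11
```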
To test the psychometric properties of the single items, we used the common discriminatory power coefficients that indicate the correlation between an item and the part-whole-corrected sum score. In addition to that, we performed discriminatory power analyses with the change scores. These coefficients indicate to what degree the change in a single item corresponds with the change of the scale after removing the item of interest.
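The part-whole-corrected coefficient and its change-score variant can be illustrated as follows. This is a sketch under the assumption that the data are available as one row of item scores per respondent; the function names and the toy data are hypothetical and do not stem from the study.

```python
import math
from statistics import mean


def pearson_r(x: list, y: list) -> float:
    """Pearson correlation; returns NaN if either variable is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom > 0 else float("nan")


def corrected_item_total(rows: list) -> list:
    """Correlate each item with the sum of the remaining items."""
    n_items = len(rows[0])
    coefficients = []
    for j in range(n_items):
        item = [row[j] for row in rows]
        rest = [sum(row) - row[j] for row in rows]  # scale without item j
        coefficients.append(pearson_r(item, rest))
    return coefficients


def change_rows(t1_rows: list, t2_rows: list) -> list:
    """Item-wise change scores (t2 minus t1) for each respondent."""
    return [[b - a for a, b in zip(r1, r2)] for r1, r2 in zip(t1_rows, t2_rows)]


# Toy data (three items, four respondents), purely illustrative
t1 = [[0, 1, 0], [1, 1, 2], [2, 1, 1], [3, 2, 3]]
t2 = [[1, 1, 0], [1, 2, 2], [3, 2, 2], [2, 2, 3]]
cross_sectional = corrected_item_total(t1)
longitudinal = corrected_item_total(change_rows(t1, t2))
```

Applying the same function to the change rows gives the discriminatory power of the item changes with respect to the change of the remaining scale.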
For establishing measurement invariance, confirmatory factor analysis (CFA) models were estimated with the diagonally weighted least squares (DWLS) method with mean- and variance-adjusted test statistics. Model fit was judged using a combinational rule of comparative fit index (CFI) and standardized root mean square residual (SRMR) [37]. Based on this rule, poor fit was indicated if both CFI and SRMR violated their thresholds for acceptability, i.e., CFI < 0.95 and SRMR > 0.06. We also present the Tucker-Lewis index (TLI) and the root mean square error of approximation (RMSEA). Differences in model fit are expressed by the difference of CFI values (∆CFI) between sequential models. A difference of at least 0.002 is considered a substantial change in model fit, and smaller differences are regarded as being negligible.
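The two decision rules stated above can be written down compactly. The sketch below only encodes the thresholds given in the text; the function names are hypothetical, and the fit indices themselves would come from the CFA software.

```python
def poor_fit(cfi: float, srmr: float,
             cfi_min: float = 0.95, srmr_max: float = 0.06) -> bool:
    """Combination rule: fit is poor only if BOTH indices miss their threshold."""
    return cfi < cfi_min and srmr > srmr_max


def substantial_drop(cfi_before: float, cfi_after: float,
                     delta: float = 0.002) -> bool:
    """True if CFI decreased by at least `delta` after adding constraints."""
    return (cfi_before - cfi_after) >= delta


# Hypothetical fit values for two sequential models
acceptable = not poor_fit(cfi=0.97, srmr=0.07)         # only SRMR misses its threshold
keep_constraints = not substantial_drop(0.990, 0.9895)  # drop of 0.0005 is negligible
```

Note that under this combination rule a model missing only one of the two thresholds is still not classified as poorly fitting.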
First, we tested the model for each time point (t1 and t2) separately. Then, an unconstrained model in which both time points were combined served as the baseline model. Acceptable fit of this model indicates configural invariance, i.e., the factor patterns remain constant over time. Starting from this model, violations of measurement invariance were searched for using a forward approach. Parameters were constrained set by set, and released if necessary, in the following order: thresholds and weights (metric or weak invariance), then additionally intercepts (scalar or strong invariance), and finally residuals (full or strict invariance) [38].
If, as a result of constraining a set of parameters, the model fit decreased substantially compared to the preceding model, a search for partial invariance was executed; otherwise, the next set of parameters was constrained. If the search for partial invariance identified a parameter that should not be constrained to equality between occasions, the constraint was released only if its release also led to a substantial increase in model fit. If no further misspecified constraint could be identified, the next set of parameters was constrained. CFAs and measurement invariance analyses were calculated with R, version 4.1.1 [39], with the packages lavaan 0.6.9 and semTools 0.5.5 [40]. All other statistics were performed with SPSS version 27.

Sample characteristics
Of the 10000 participants in the baseline examination, 9751 persons provided valid GAD-7 data. Characteristics of that sample have been published previously [7]. The response rate of the baseline examination was 33%. Using the criterion of at least six valid GAD-7 items at both t1 and t2, the final sample consisted of 5355 individuals, resulting in a final response rate of 18%. Sociodemographic characteristics of the participants are given in Table 1. The mean time interval between the t1 and the t2 examination was 6.04 years (SD = 0.42 years).

Anxiety mean scores and item characteristics
Table 2 shows that anxiety increased during the 6-year period from 3.28 to 3.66. This difference (d = 0.11) is statistically significant with p < 0.001. Using the cut-off of 10 or higher for a heightened GAD-7 score [6], 90.8% of the sample had normal anxiety scores both at t1 and t2, 2.7% showed heightened anxiety only at t1, 4.7% only at t2, and 1.8% at both time points.
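The four patterns reported above result from crossing the cut-off criterion at t1 with that at t2. A minimal sketch of this classification is given below; the function names, labels, and score pairs are hypothetical and serve only to show the logic.

```python
from collections import Counter


def caseness_pattern(total_t1: int, total_t2: int, cutoff: int = 10) -> str:
    """Cross-classify heightened anxiety (score >= cutoff) at t1 and t2."""
    high_t1, high_t2 = total_t1 >= cutoff, total_t2 >= cutoff
    if high_t1 and high_t2:
        return "heightened at both"
    if high_t1:
        return "heightened only at t1"
    if high_t2:
        return "heightened only at t2"
    return "normal at both"


def pattern_shares(pairs: list, cutoff: int = 10) -> dict:
    """Percentage of respondents falling into each pattern."""
    counts = Counter(caseness_pattern(a, b, cutoff) for a, b in pairs)
    n = sum(counts.values())
    return {pattern: 100 * count / n for pattern, count in counts.items()}


# Hypothetical (t1, t2) sum scores of four respondents
shares = pattern_shares([(3, 3), (12, 4), (5, 11), (15, 14)])
```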
The GAD-7 mean score of those individuals who participated at t1 but not at t2 (dropouts, n = 4396) was 3.91 ± 3.60, significantly higher (p < 0.001) than the t1 mean score of those who also attended the t2 examination (M = 3.28 ± 3.16).
Table 2 shows that all items except item 6 (being easily annoyed or irritable) contributed to the increase in anxiety from t1 to t2, with effect sizes between 0.03 and 0.18. The main contributors to this increase were item 1 (feeling nervous) (d = 0.18) and item 7 (feeling afraid) (d = 0.17).
All items contributed to the GAD-7 total score, both at t1 and t2, with discrimination power coefficients between 0.49 and 0.70. The column "discrimination power (Δitem, Δscale)" shows that all item changes from t1 to t2 contributed to the change of the GAD-7 total score with somewhat lower but nevertheless positive coefficients (between 0.36 and 0.53).
Table 2 also shows that the test-retest correlation of the GAD-7 sum score was 0.53, and that the test-retest correlations of the single items ranged from 0.34 to 0.45. The correlations between the GAD-7 scores and the GAD-2 scores were 0.87 (at t1) and 0.88 (at t2), and the correlation between the GAD-7 change score (t2 minus t1) and the GAD-2 change score was 0.80.

Measurement invariance
The results of the measurement invariance analyses are presented in Table 3. When t1 and t2 were analyzed separately, CFA results indicated a good model fit for both measurement points. When both measurement points were taken together in one model, full invariance could be established.

The impact of sociodemographic variables on the course of anxiety
Table 4 presents mean score differences between certain groups of participants. Females were more anxious than males at t1 and at t2, and the increase in anxiety from t1 to t2 was slightly, but not significantly, higher in females (Δ = 0.40) than in males (Δ = 0.35). All age groups except for the age group 60-69 years showed an increase in anxiety. While SES was negatively associated with anxiety in the cross-sectional perspective, there was no such linear relationship for the changes in anxiety; the difference scores were between 0.26 and 0.42 for the three SES groups.
Based on the criterion of a clinically meaningful change in anxiety being four or more points [23], 428 participants (8.0%) showed a relevant decrease in anxiety, 4249 (79.3%) showed no relevant change, and 678 (12.7%) showed a relevant increase in anxiety. The corresponding proportions, broken down by sex, age group, and SES group, are also given in Table 4.
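As a quick arithmetic check, the three counts reported above can be converted back into percentages of the final sample. The variable names are only illustrative.

```python
# Counts as reported in the text for the three change categories
counts = {
    "relevant decrease": 428,
    "no relevant change": 4249,
    "relevant increase": 678,
}

total = sum(counts.values())  # should equal the final sample size of 5355
percentages = {label: round(100 * n / total, 1) for label, n in counts.items()}
```

The counts add up to the final sample size, and the rounded percentages reproduce the values given in the text.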
Regarding the temporal stability r_tt, Table 4 shows that there were nearly no sex differences (coefficients of 0.53 and 0.52 for males and females, respectively). Concerning age, the highest stability was found for the age group 50-59 years (r_tt = 0.57).

Correlations with other psychological or QoL variables
The highest cross-sectional correlations of the GAD-7 were found for the mental health component of the SF-8 (r = 0.68 at t1 and r = 0.66 at t2). The comparison between the correlations at t1 with those at t2 indicates only small differences between the measurement points with the exception of the LOT-R correlations, which were somewhat higher at t2.
The last column of Table 5 presents the (longitudinal) correlations between the changes (increases or decreases from t1 to t2) of the GAD-7 score with the changes of the other scales. All of these correlations are smaller than the corresponding cross-sectional correlations, but the sequence of the correlations is very similar to that of the raw scores: Variables with high cross-sectional associations such as Mental Health also show relatively high associations between the change scores. To illustrate the association between GAD-7 change and the change in the other variables using the PHQ-15 as an example, we calculated the mean change scores of the PHQ-15 (t2 score minus t1 score) for each of the three GAD-7 change categories. The results were: ΔPHQ-15 = -1.12 ± 3.27 for the group with meaningful decrease in anxiety, ΔPHQ-15 = 0.44 ± 2.91 for the group with no meaningful change in anxiety, and ΔPHQ-15 = 2.47 ± 3.77 for the group with meaningful increases in anxiety from t1 to t2.
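The grouping behind these three mean change scores can be sketched as follows. The record layout, function names, and toy values are hypothetical; only the 4-point criterion and the definition of the change score (t2 minus t1) are taken from the text.

```python
from collections import defaultdict
from statistics import mean, stdev


def change_category(delta_gad7: int, threshold: int = 4) -> str:
    """Assign a GAD-7 change score to one of the three change categories."""
    if delta_gad7 <= -threshold:
        return "decrease"
    if delta_gad7 >= threshold:
        return "increase"
    return "no relevant change"


def phq15_change_by_category(records: list, threshold: int = 4) -> dict:
    """Mean and SD of the PHQ-15 change per GAD-7 change category.

    Each record is (gad7_t1, gad7_t2, phq15_t1, phq15_t2).
    """
    groups = defaultdict(list)
    for g1, g2, p1, p2 in records:
        groups[change_category(g2 - g1, threshold)].append(p2 - p1)
    return {
        category: (mean(values), stdev(values) if len(values) > 1 else float("nan"))
        for category, values in groups.items()
    }


# Toy records, purely illustrative (not study data)
toy = [
    (8, 2, 9, 7),   # GAD-7 change -6, PHQ-15 change -2
    (10, 5, 6, 6),  # GAD-7 change -5, PHQ-15 change 0
    (3, 4, 5, 5),   # GAD-7 change +1, PHQ-15 change 0
    (2, 2, 4, 6),   # GAD-7 change 0, PHQ-15 change +2
    (1, 6, 3, 6),   # GAD-7 change +5, PHQ-15 change +3
    (4, 9, 7, 8),   # GAD-7 change +5, PHQ-15 change +1
]
summary = phq15_change_by_category(toy)
```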

Discussion
The first aim of this study was to examine whether anxiety levels change over a 6-year period. While the small age differences in anxiety suggest that an additional six life-years should result in only marginal mean score changes, the t2 mean scores were nevertheless higher than those of the t1 measurement (d = 0.11). When comparing the mean values of t1 and t2, two differences should be considered. First, the t1 examination was carried out at the study center, while at t2 the questionnaire was completed at home, and the completed t2 questionnaire was sent to the study center by mail. Second, the end of the t2 study period coincided with the beginning of the COVID-19 pandemic, and this may have increased anxiety levels in the general population. However, a systematic review and meta-analysis comparing mental health before versus during the COVID-19 pandemic in 2020 [41] found that the anxiety mean scores in the older adult general population increased only slightly or not at all due to the COVID-19 pandemic, a result that was also confirmed by further general population studies [42-44]. Therefore, we do not think that the partial overlap with the COVID-19 pandemic had a substantial impact on the t2 results, though we cannot quantify this possible effect. In addition, there is no evidence that completing questionnaires during a laboratory study would generally overestimate or underestimate results compared with a postal survey. Thus, the results of this study suggest that anxiety levels really do increase over time. A similar result was found in a previous study using the SF-36, which reported deteriorations in mental health over a 6-year period [45], although such age effects could not be detected in a cross-sectional study [46]. This also means that cross-sectional studies with comparisons of age groups may be insufficient tools for predicting changes in QoL and mental health variables over time.
The GAD-7 proved to be a reliable instrument. While psychometric characteristics designed for cross-sectional studies have already been established in multiple studies, the current investigation adds that the items are also appropriate for detecting long-term changes, with coefficients of at least 0.36 for the association between item change and part-whole corrected scale change. Since this way of calculating coefficients for the reliability of change has not been applied to the GAD-7 before, it is not possible to compare the results with other investigations.
Measurement invariance was also established from the long-term perspective. While such measurement invariance has already been found in patient groups [27,28], this study adds that the longitudinal comparisons of GAD-7 examinations are also justified in the general population. Though there were differences between the t1 and the t2 examination concerning setting and possible partial impact of the COVID-19 pandemic, the results of the measurement invariance analyses show that the comparisons between the t1 and the t2 scores are fair. Beyond the mean score changes, the correlations between the t1 and the t2 scores indicate the degree of temporal stability of anxiety. Our results (r = 0.53 for the total sample) are in line with findings from the literature, where the following stability coefficients for different anxiety questionnaires and different time intervals are reported: r = 0.59 (3 years) [47], r = 0.60 (4 years) [48], r = 0.50 (5 years) [49], and r = 0.55 (6 years) [50]. A study with a 3-month time interval found a higher stability coefficient of r = 0.65 [51]; however, it is plausible that the stability decreases with increasing amounts of time elapsed between measurements [11,48].
What this study also adds is that there are no sex differences in the temporal stability of anxiety (r = 0.52 for males and r = 0.53 for females). Though women tend to report more anxiety and more emotional instability than men, their degree of fluctuation in this 6-year period is not higher than that of men. Regarding age, there were no clear age effects on the temporal stability. The lowest coefficient (r = 0.45) was observed for the youngest age group (≤ 39 years) which, however, should not be over-interpreted because of the relatively small sample size of that group. It is remarkable that despite the high increase in anxiety in the oldest age group, the predictability (r = 0.52) is nearly as high as that of the total sample. The consequence for the interpretation of long-term stability examinations in older patient groups is that increases in anxiety are common, but that the individual predictability of the future state of anxiety is not affected by old age. Though life changes and losses in older age (increasing health problems; loss of loved ones) lead to a slight increase in the anxiety mean score level, this increase does not show more individual differences in the older group than in the other groups.
Living without a partner and being unemployed are associated with higher levels of anxiety [7]. However, our results show that not having a partner or being unemployed does not predict changes in anxiety. That is, the difference in anxiety between employed and unemployed people neither increases nor decreases on a mean score level.
The relationship between anxiety and other QoL or mental health variables has already been examined in multiple studies. Our new analysis adds associations observable over the long-term perspective. Changes in anxiety were correlated with changes in the other variables, though the correlations of the change scores were lower than the correlations of the raw scores in the cross-sectional design, a result that has also been reported elsewhere [18]. The highest change score correlations were those that also showed high cross-sectional correlations (e.g., correlations with the Mental Health scale of the SF-8). Research projects that aim at distinguishing between certain types of clinical or health variables (e.g., fatigue and depression) can profit from such longitudinal correlation profiles.

Limitations
The response rate of the study was low (33% at baseline). Persons with poor mental health were probably underrepresented in the baseline assessment; a comparison between the 10000 participants of the t1 assessment and those non-participants who were nevertheless willing to answer several questions concerning their behavior and health state [52] found several differences between these groups. Compared with non-participants, participants in the study had the following characteristics: higher education, higher proportion of married and employed individuals, more non-smokers, and better physical health, e.g., with respect to cardiovascular disease and diabetes [52]. Moreover, an analysis of the overall survival rate of the study participants showed that their survival rate was higher than the German average [53]. Less than 60% of the t1 participants took part in the t2 examination, and we do not know the course of anxiety in those who dropped out. We also do not have information on reasons for non-participation at t2. Possible reasons are death, being too ill to take part in the t2 examination, loss of interest after having already obtained some health-related information at t1, lack of time, or having moved. Such limitations in representativeness are common in epidemiological research. Nevertheless, they show that the mean scores of the t1 and the t2 examination should be interpreted with caution.

Conclusions
The results of this study provide a deeper insight into the conditions that are relevant for analyses of change in anxiety over several years. The GAD-7 proved to be a reliable instrument for longitudinal studies.